projection technique
Creating User-steerable Projections with Interactive Semantic Mapping
Oliveira, Artur André, Espadoto, Mateus, Hirata, Roberto Jr., Cesar, Roberto M. Jr., Telea, Alex C.
Dimensionality reduction (DR) techniques map high-dimensional data into lower-dimensional spaces. Yet, current DR techniques are not designed to explore semantic structure that is not directly available in the form of variables or class labels. We introduce a novel user-guided projection framework for image and text data that enables customizable, interpretable, data visualizations via zero-shot classification with Multimodal Large Language Models (MLLMs). We enable users to steer projections dynamically via natural-language guiding prompts, to specify high-level semantic relationships of interest to the users which are not explicitly present in the data dimensions. We evaluate our method across several datasets and show that it not only enhances cluster separation, but also transforms DR into an interactive, user-driven process. Our approach bridges the gap between fully automated DR techniques and human-centered data exploration, offering a flexible and adaptive way to tailor projections to specific analytical needs.
- Europe > Netherlands (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- Research Report (0.64)
- Overview (0.46)
Interactive Counterfactual Generation for Univariate Time Series
Schlegel, Udo, Rauscher, Julius, Keim, Daniel A.
We propose an interactive methodology for generating counterfactual explanations for univariate time series data in classification tasks by leveraging 2D projections and decision boundary maps to tackle interpretability challenges. Our approach aims to enhance the transparency and understanding of deep learning models' decision processes. The application simplifies the time series data analysis by enabling users to interactively manipulate projected data points, providing intuitive insights through inverse projection techniques. By abstracting user interactions with the projected data points rather than the raw time series data, our method facilitates an intuitive generation of counterfactual explanations. This approach allows for a more straightforward exploration of univariate time series data, enabling users to manipulate data points to comprehend potential outcomes of hypothetical scenarios. We validate this method using the ECG5000 benchmark dataset, demonstrating significant improvements in interpretability and user understanding of time series classification. The results indicate a promising direction for enhancing explainable AI, with potential applications in various domains requiring transparent and interpretable deep learning models. Future work will explore the scalability of this method to multivariate time series data and its integration with other interpretability techniques.
Understanding dimensionality reduction in machine learning models
Machine learning algorithms have gained fame for being able to ferret out relevant information from datasets with many features, such as tables with dozens of rows and images with millions of pixels. Thanks to advances in cloud computing, you can often run very large machine learning models without noticing how much computational power works behind the scenes. But every new feature that you add to your problem adds to its complexity, making it harder to solve it with machine learning algorithms. Data scientists use dimensionality reduction, a set of techniques that remove excessive and irrelevant features from their machine learning models. Dimensionality reduction slashes the costs of machine learning and sometimes makes it possible to solve complicated problems with simpler models. Machine learning models map features to outcomes.
Machine learning: What is dimensionality reduction?
Machine learning algorithms have gained fame for being able to ferret out relevant information from datasets with many features, such as tables with dozens of rows and images with millions of pixels. Thanks to advances in cloud computing, you can often run very large machine learning models without noticing how much computational power works behind the scenes. But every new feature that you add to your problem adds to its complexity, making it harder to solve it with machine learning algorithms. Data scientists use dimensionality reduction, a set of techniques that remove excessive and irrelevant features from their machine learning models. Dimensionality reduction slashes the costs of machine learning and sometimes makes it possible to solve complicated problems with simpler models.
Comprehensive Guide to 12 Dimensionality Reduction Techniques
Have you ever worked on a dataset with more than a thousand features? I have, and let me tell you it's a very challenging task, especially if you don't know where to start! Having a high number of variables is both a boon and a curse. It's great that we have loads of data for analysis, but it is challenging due to size. It's not feasible to analyze each and every variable at a microscopic level. It might take us days or months to perform any meaningful analysis and we'll lose a ton of time and money for our business! Not to mention the amount of computational power this will take. We need a better way to deal with high dimensional data so that we can quickly extract patterns and insights from it. So how do we approach such a dataset?
Deep Learning Multidimensional Projections
Espadoto, Mateus, Hirata, Nina S. T., Telea, Alexandru C.
Dimensionality reduction methods, also known as projections, are frequently used for exploring multidimensional data in machine learning, data science, and information visualization. Among these, t-SNE and its variants have become very popular for their ability to visually separate distinct data clusters. However, such methods are computationally expensive for large datasets, suffer from stability problems, and cannot directly handle out-of-sample data. We propose a learning approach to construct such projections. We train a deep neural network based on a collection of samples from a given data universe, and their corresponding projections, and next use the network to infer projections of data from the same, or similar, universes. Our approach generates projections with similar characteristics as the learned ones, is computationally two to three orders of magnitude faster than SNE-class methods, has no complex-to-set user parameters, handles out-of-sample data in a stable manner, and can be used to learn any projection technique. We demonstrate our proposal on several real-world high dimensional datasets from machine learning.
Ramp: Fast Frequent Itemset Mining with Efficient Bit-Vector Projection Technique
Bashir, Shariq, Baig, Abdul Rauf
Mining frequent itemset using bit-vector representation approach is very efficient for dense type datasets, but highly inefficient for sparse datasets due to lack of any efficient bit-vector projection technique. In this paper we present a novel efficient bit-vector projection technique, for sparse and dense datasets. To check the efficiency of our bit-vector projection technique, we present a new frequent itemset mining algorithm Ramp (Real Algorithm for Mining Patterns) build upon our bit-vector projection technique. The performance of the Ramp is compared with the current best (all, maximal and closed) frequent itemset mining algorithms on benchmark datasets. Different experimental results on sparse and dense datasets show that mining frequent itemset using Ramp is faster than the current best algorithms, which show the effectiveness of our bit-vector projection idea. We also present a new local maximal frequent itemsets propagation and maximal itemset superset checking approach FastLMFI, build upon our PBR bit-vector projection technique. Our different computational experiments suggest that itemset maximality checking using FastLMFI is fast and efficient than a previous will known progressive focusing approach.
- Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Florida > Brevard County > Melbourne (0.04)
- (3 more...)